Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints
Authors
Abstract
We present a novel approach for unsupervised learning of depth and ego-motion from monocular video. Unsupervised learning removes the need for separate supervisory signals (depth or ego-motion ground truth, or multi-view video). Prior work in unsupervised depth learning uses pixel-wise or gradient-based losses, which only consider pixels in small local neighborhoods. Our main contribution is to explicitly consider the inferred 3D geometry of the scene, enforcing consistency of the estimated 3D point clouds and ego-motion across consecutive frames. This is a challenging task and is solved by a novel (approximate) backpropagation algorithm for aligning 3D structures. We combine this novel 3D-based loss with 2D losses based on the photometric quality of frame reconstructions using estimated depth and ego-motion from adjacent frames. We also incorporate validity masks to avoid penalizing areas in which no useful information exists. We test our algorithm on the KITTI dataset and on a video dataset captured on an uncalibrated mobile phone camera. Our proposed approach consistently improves depth estimates on both datasets, and outperforms the state-of-the-art for both depth and ego-motion. Because we only require a simple video, learning depth and ego-motion on large and varied datasets becomes possible. We demonstrate this by training on the low-quality uncalibrated video dataset and evaluating on KITTI, ranking among top performing prior methods which are trained on KITTI itself.
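The abstract describes two complementary supervision signals: a 2D photometric loss, obtained by reconstructing the current frame from an adjacent one using the estimated depth and ego-motion, and a 3D loss that enforces consistency between the point clouds implied by consecutive depth maps. The NumPy sketch below illustrates how such terms could be computed for a pinhole camera with intrinsics K; the function names, the nearest-neighbour image sampling, and the nearest-neighbour point matching that stands in for the paper's approximate-ICP alignment are illustrative assumptions, not the authors' implementation.

import numpy as np

def pixel_to_cam(depth, K_inv):
    # Back-project a depth map of shape (H, W) into camera-frame 3D points (H*W, 3).
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x HW homogeneous pixels
    rays = K_inv @ pix                                                 # normalized viewing rays
    return (rays * depth.reshape(1, -1)).T                             # scale rays by depth

def photometric_loss(frame_t, frame_s, depth_t, T_t_to_s, K):
    # Warp the adjacent (source) frame s into view t using depth_t and the 4x4
    # ego-motion T_t_to_s, then compare intensities (L1). Pixels that project
    # outside the source image are masked out, loosely mirroring the paper's
    # validity masks.
    h, w = depth_t.shape
    pts_t = pixel_to_cam(depth_t, np.linalg.inv(K))            # HW x 3 points in frame t
    pts_s = T_t_to_s[:3, :3] @ pts_t.T + T_t_to_s[:3, 3:4]     # 3 x HW points in frame s
    proj = K @ pts_s
    u, v, z = proj[0] / proj[2], proj[1] / proj[2], proj[2]
    valid = (z > 0) & (u >= 0) & (u <= w - 1) & (v >= 0) & (v <= h - 1)
    ui = np.clip(np.round(u).astype(int), 0, w - 1)            # nearest-neighbour sampling
    vi = np.clip(np.round(v).astype(int), 0, h - 1)            # (a real model would use bilinear)
    err = np.abs(frame_s[vi, ui] - frame_t.reshape(-1))
    return err[valid].mean() if valid.any() else 0.0

def point_cloud_loss(depth_t, depth_s, T_t_to_s, K, n_samples=500):
    # Crude stand-in for the paper's ICP-based 3D loss: move the point cloud of
    # frame t into frame s with the estimated ego-motion and measure the mean
    # nearest-neighbour distance to a subsampled point cloud of frame s.
    K_inv = np.linalg.inv(K)
    pts_t = pixel_to_cam(depth_t, K_inv)
    pts_s = pixel_to_cam(depth_s, K_inv)
    pts_t_in_s = (T_t_to_s[:3, :3] @ pts_t.T + T_t_to_s[:3, 3:4]).T
    idx = np.random.choice(len(pts_t_in_s), size=min(n_samples, len(pts_t_in_s)), replace=False)
    d = np.linalg.norm(pts_t_in_s[idx, None, :] - pts_s[None, ::97, :], axis=-1)
    return d.min(axis=1).mean()

In the full method, the 3D residuals are backpropagated through an approximate alignment procedure and the photometric term is combined with validity masks during training; the sketch only conveys the general shape of the two loss terms.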
Similar resources
A Machine Learning Approach to Recovery of Scene Geometry from Images
Recovering the 3D structure of the scene from images yields useful information for tasks such as shape and scene recognition, object detection, or motion planning and object grasping in robotics. In this thesis, we introduce a general machine learning approach called unsupervised CRF learning based on maximizing the conditional likelihood. We describe the application of our machine learning app...
Video Subject Inpainting: A Posture-Based Method
Despite recent advances in video inpainting techniques, reconstructing large missing regions of a moving subject while its scale changes remains an elusive goal. In this paper, we introduce a scale-change-invariant method for large missing regions to tackle this problem. Using this framework, the moving foreground is first separated from the background and its scale is equalized. Then, a ...
Self-Supervised Depth Learning for Urban Scene Understanding
As an agent moves through the world, the apparent motion of scene elements is (usually) inversely proportional to their depth. It is natural for a learning agent to associate image patterns with the magnitude of their displacement over time: as the agent moves, far away mountains don't move much; nearby trees move a lot. This natural relationship between the appearance of objects and their mot...
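As a back-of-the-envelope illustration of the parallax-depth relationship mentioned above (added here for clarity, not taken from the cited work): for a camera with focal length f (in pixels) translating laterally by t_x, a static point at depth Z shifts between frames by approximately

\Delta u \approx \frac{f \, t_x}{Z},

so the apparent image motion falls off inversely with depth, and depth can be read off as Z \approx f t_x / \Delta u.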
Application of 3D Human Motion in the Sports based on Computer Aided Analysis
3D human motion tracking has in recent years become a very important research direction in machine vision, with a wide range of applications such as human-computer interaction, intelligent animation, and video surveillance. Current research on three-dimensional human motion tracking is mostly based on multi-view video; monocular video, due to the lack of depth information, ...
Covariance Scaled Sampling for Monocular 3D Body Tracking
We present a method for recovering 3D human body motion from monocular video sequences using robust image matching, joint limits and non-self-intersection constraints, and a new sample-and-refine search strategy guided by rescaled cost-function covariances. Monocular 3D body tracking is challenging: for reliable tracking at least 30 joint parameters need to be estimated, subject to highly nonlin...
Journal: CoRR
Volume: abs/1802.05522
Issue: -
Pages: -
Publication year: 2018